perm filename CIRCUM[F83,JMC]2 blob sn#729843 filedate 1983-11-03 generic text, type T, neo UTF8
.<<circum[f83,jmc]>>
.require "memo.pub[let,jmc]" source;
.double space
.cb AI Applications of Circumscription

	(McCarthy 1980) introduces the circumscription method of
non-monotonic reasoning and gives motivation, some mathematical
properties and some examples of its application.  We assume acquaintance
with its ideas.

	The present paper gives a slightly generalized form of
circumscription and applications to representing common sense facts in a data
base.  It turns out that many such applications can be done in a uniform
way.  A single predicate ⊗ab, standing for "abnormal" is circumscribed
with all other predicates and functions considered as variables that
can be constrained to achieve the circumscription subject to the axioms.
Whether other applications require circumscribing other formulas than
⊗ab_z is presently unknown.

	We begin with our generalized circumscription.
.skip 1

%3Definition:%1 Let  %2A(P)%1 be a formula of second order logic involving
a tuple ⊗P of free predicate symbols.  Let ⊗E(P,x)  be a wff in which
⊗P  and a tuple ⊗x of individual variables occur free.  The circumscription
of  ⊗E(P,x)  relative to  ⊗A(P)  is the formula  ⊗A'(P)  defined by

!!a1:	%2A(P) ∧ ∀P'.[A(P') ∧ [∀x.E(P',x) ⊃ E(P,x)] ⊃ [∀x.E(P',x) ≡ E(P,x)]].%1

[We are here writing ⊗A(P) instead of %2A(P%41%*,_._._._,P%4n%*)%1 for brevity
and likewise writing ⊗E(P,x) instead of
%2E(P%41%*,_._._._,P%4n%*,x%41%*,_._._._,x%4n%*)%1.
Likewise the quantifier ⊗∀x stands for %2∀x%41%*_._._._x%4n%1.
We don't use the full generality in this paper.

.skip 1
	There are two differences between this and (McCarthy 1980).  First,
in that paper  ⊗E(P,x)  had the specific form  ⊗P(x).  Here we speak of
circumscribing a wff, while there we spoke of circumscribing a predicate.
Second, in ({eq a1}) we use an explicit quantifier for the predicate
variable  ⊗P' whereas in (McCarthy 1980), the formula was a schema.
One advantage of the present formalism is that now  ⊗A'(P)  is the
same kind of entity as  ⊗A(P) and can be used as the axiom for circumscribing
some other wff.

.skip 2
#. Minimizing abnormality

	Many people have imagined representing facts about what is
"normally" the case.  One problem is that every object is abnormal
in some way, and we want to allow some aspects of the object to be
abnormal and still conclude the normality of the rest.
We do this with a predicate  ⊗ab standing for "abnormal".  We circumscribe
%2ab_x%1.  The argument of ⊗ab will be some aspect of the entities
involved.  Some aspects can be abnormal without affecting others.  The
aspects themselves are abstract entities, and their unintuitiveness is
somewhat a blemish on the theory.  Next we have applications.

.skip 2
#. The unique names hypothesis

	Raymond Reiter (1979) introduced the phrase "unique names hypothesis"
for the assumption that distinct names denote distinct objects.  We want
to show how to handle this in a more general way by circumscription.

	Suppose we introduce names as objects, i.e. we introduce
⊗'John as a name for John.  For brevity we use the Lisp notation
⊗'John instead of the more usual "John".  We then assume an infinity
of first order axioms like

	%2'John ≠ 'Henry%1,

i.e. for every pair of distinct names there is such an axiom.  From the
point of view of mathematical logic, there is no harm in having
an infinity of such axioms.  From the computational point of view of
a theorem proving or problem solving program, we
merely suppose that we rely on the computer to generate the assertion
that two names are distinct whenever this is required, since
a subroutine can easily tell whether two strings are the same.
We might suppose that some metamathematical considerations would
change from using an infinity of axioms, but usually this won't
be so, because we can generate an infinity of provably distinct
objects from a finite set of axioms.  All we need do is use
integers as names, axiomatize > as a relation including its
transitivity and have an axiom %2∀x_y.x_>_y_⊃_x_≠_y%1.  It is equally
feasible to finitely axiomatize strings in a way that ensures that
distinct strings are provably distinct.

	One way of getting unique names is to use axioms like

!!a2:	%2name(John) = 'John%1

and

	%2name(Henry) = 'Henry%1.

The distinctness of names and the theory of inequality will then
give %2John_≠_Henry%1.

	However, this is too absolute for many applications.  It is
often preferable that objects denoted by distinct
names are to be presumed distinct unless there is reason to believe
them to be the same.  We can do this as follows.  Instead of
using axioms like ({eq a2}), we write

!!a3:	%2denotation('John) = John%1

and

	%2denotation('Henry) = Henry%1

Nothing now prevents  %2John = Henry%1, so we use the axiom

!!a4:	%2∀n1 n2. n1 ≠ n2 ∧ ¬ab aspecteq(n1,n2) ⊃ denotation n1 ≠ denotation n2%1.

	The idea of this axiom is that if ⊗n1 and ⊗n2 are distinct names,
and ⊗aspecteq(n1,n2) is not abnormal, i.e. does not satisfy the predicate
⊗ab, then ⊗n1 and ⊗n2 denote distinct objects.  When we have "taken into
account" all the facts we are going to take into account we will circumscribe
the expression  %2ab_x%1.  This will make %2ab_aspecteq(n1,n2)%1 false
unless some facts prevent it and hence will allow inferring that
 %2denotation_n1%1 and
%2denotation_n2%1 are distinct unless there is evidence that they are equal.

.skip 1
#. Some other examples of Reiter

	Reiter asks about representing, "Quakers are normally pacifists
and Republicans are normally non-pacificists.  How about Nixon, who is
both a Quaker and a Republican?"

We use

!!c1:	%2∀x.quaker x ∧ ¬ab aspect1 x ⊃ pacifist x%1,

!!c2:	%2∀x.republican x ∧ ¬ab aspect2 x ⊃ ¬ pacifist x%1

and

!!c3:	%2quaker Nixon ∧ republican Nixon%1.

	When we circumscribe ⊗ab_z using these three sentences as
⊗A(ab), we will only be able to conclude that Nixon is either abnormal
in ⊗aspect1 or in ⊗aspect2, and we will not be able to say whether he
is a pacifist.

	Reiter's second example is that a person normally lives in the same
city as his wife and in the same city as his employer.  But A's wife
lives in Vancouver and A's employer is in Toronto.
We write

!!c4:	%2∀x.¬ab aspect1 x ⊃ city x = city wife x%1

and

!!c5:	%2∀x.¬ab aspect2 x ⊃ city x = city employer x%1.

If we have

!!c6:	%2city wife A = Vancouver ∧ city employer A = Toronto ∧ Toronto ≠ Vancouver%1,

we will again only be able to conclude that A lives either in Toronto
or Vancouver.

.skip 2
#. Whether birds can fly

	I have explored many ways of non-monotonically axiomatizing the
facts about which birds can fly.  The following set of axioms using ⊗ab seems
to me quite straightforward.

!!a4a:	%2∀x.¬ab aspect1 x ⊃ ¬flies x%1.
Unless an object is abnormal in ⊗aspect1, it can't fly.

	Note that it wouldn't work to write %2ab x%1 instead of %2ab aspect1 x%1,
because we don't want a bird that is abnormal with respect to its ability
to fly to be automatically abnormal in other respects.  Using aspects limits
the effects of proofs of abnormality.

!!a5:	%2∀x.bird x ⊃ ab aspect1 x%1.
A bird is abnormal in ⊗aspect1, so ({eq a4a}) can't be used to show it can't fly.
If this axiom were omitted, when we did the circumscription we would
only be able to infer a disjunction.  Either a bird is abnormal in %2aspect1%1
or it can fly unless it is abnormal in ⊗aspect2.  ({eq a5}) establishes
expresses our preference for inferring that a bird is abnormal in ⊗aspect1 
rather than ⊗aspect2. 

!!a6:	%2∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x%1.
Unless a bird is abnormal in ⊗aspect2, it can fly.

!!a7:	%2∀x.ostrich x ⊃ ab aspect2 x%1.
Ostriches are abnormal in ⊗aspect2.  This doesn't say that an ostrich cannot
fly - merely that ({eq a6}) can't be used to infer that it does.

!!a8:	%2∀x.penguin x ⊃ ab aspect2 x%1.
Penguins are also abnormal in ⊗aspect2. 

!!a9:	%2∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x%1.

!!a10:	%2∀x.penguin x ∧¬ab aspect4 x ⊃ ¬flies x%1.

Normally ostriches and penguins can't fly.  However, there is
an out.  ({eq a9}) and ({eq a10}) provide that under unspecified conditions,
an ostrich or penguin might fly after all.  If we give no
such conditions, we will conclude that an ostrich or penguin
can't fly.  Additional objects that can fly may be specified.
Each needs two axioms.  The first says that it is abnormal in
⊗aspect1 and prevents ({eq a3}) from being used to say that it can't
fly.  The second provides that it can fly unless it is abnormal
in yet another way.  Likewise additional non-flying birds can be
provided for.

	We haven't yet said that ostriches and penguins are birds,
so let's do that and throw in that canaries are birds also.

!!a11:	%2∀x.ostrich x ⊃ bird x%1

!!a12:	%2∀x.penguin x ⊃ bird x%1

!!a13:	%2∀x.canary x ∧ ¬ab aspect5 x ⊃ bird x%1
We threw in ⊗aspect5 just in case one wanted to use the term canary
in the sense of a 1930s gangster movie.

Asserting that ostriches, penguins and canaries are birds will help
inherit other properties from the class of birds.  For example, we have

!!a14:	%2∀x. bird x ∧ ¬ab aspect6 x ⊃ feathered x%1.

So far there is nothing to prevent ostriches, penguins and canaries
from overlapping.  We could write disjointness axioms like

!!a15: %2∀x. ¬ostrich x ∨ ¬penguin x%1,

but we would require  %2n%52%1  of them if we have  ⊗n  species.
We need to do something like the unique names circumscription again.

Therefore, we write axioms

!!a16:	%2∀x. ostrich x ≡ belongs(x,'ostrich)%1

and similarly for penguins, canaries and birds.
We can now write

!!a17:	%2∀n1 n2. n1 ≠ n2 ∧ ¬ab aspect8(n1,n2) ⊃ ∀x.¬(belongs(x,n1) ∧ belongs(x,n2)%1.

This doesn't do everything we want, because we have allowed a canary  ⊗x 
two choices.  It can be a bird avoiding %2ab_aspect5_x%1 or it can be
a non-bird avoiding %2aspect8('canary,'bird)%1.  We want the former to
have priority and can do it simply by writing

!!a18:	%2ab aspect8('canary,bird)%1.

	In what follows we will need uniqueness axioms for the aspects,
and they are a nuisance to write.  They could be written

!!a18a:	%2∀n1 n2 x1 x2.aspect(n1,x1) = aspect(n2,x2) ≡ n1 = n2 ∧ x1 = x2%1,

if we make ⊗n a parameter, i.e. write ⊗aspect(2,x) instead of ⊗aspect2_x, but
even this doesn't to everything we need, because it doesn't cover aspects
that take more than one argument.  Aspects with many arguments can be reduced to
one argument aspects by assuming them to operate on tuples.  For this paper,
we won't make this change of notation and merely assume that all aspects
are distinct objects.

.skip 2
#. General considerations

	Suppose we have a data base of facts axiomatized
by a formalism involving the predicate ⊗ab.  In connection with a particular
problem, a program takes a subcollection of these facts together with
the specific facts of the problem and the circumscribes ⊗ab x.
We get a second order formula, and in general, as the natural number
example of (McCarthy 1980) shows,
 this formula is not
equivalent to any first order formula.  However, many common sense domains
are axiomatizable in such a way that the circumscription is equivalent
to a first order formula.  For example, Vladimir Lifschitz (1983) has
shown that this is true if the axioms are a universal formula and there
are no functions.  This can presumably be extended to the case when
the ranges and domains of the functions are disjoint, so that there
is no way of generating an infinity of elements.

	We can then regard the process of deciding what facts to take
into account and then circumscribing as a process of compiling from
a slightly higher level non-monotonic language into mathematical
logic, especially first order logic.  We can also regard natural
language as higher level than logic, although, as I shall discuss
elsewhere, natural language doesn't have an independent reasoning
process, because most natural language inferences involve suppressed
premisses which are not represented in natural language in the minds
of the people doing the reasoning.

.skip 2
#. An example of doing the circumscription.

	In order to keep the example short we will take into account
only the following facts from the previous section.

{eq a4a})	%2∀x.¬ab aspect1 x ⊃ ¬flies x%1.

{eq a5})	%2∀x.bird x ⊃ ab aspect1 x%1.

{eq a6})	%2∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x%1.

{eq a7})	%2∀x.ostrich x ⊃ ab aspect2 x%1.

{eq a9})	%2∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x%1.

	Their conjunction is taken as %2A(ab,flies)%1.  The
circumscription formula ⊗A'(ab,flies) is then

!!a19:	%2A(ab,flies)
	 ∧ ∀ab' flies'.[A(ab',flies') ∧ [∀x. ab' x ⊃ ab x] ⊃ [∀x. ab x ≡ ab' x]]%1,

which spelled out becomes

!!a20:	%2[∀x.¬ab aspect1 x ⊃ ¬flies x] ∧ [∀x.bird x ⊃ ab aspect1 x]
∧ [∀x.bird x ∧ ¬ab aspect2 x ⊃ flies x] ∧ [∀x.ostrich x ⊃ ab aspect2 x]
∧ [∀x.ostrich x ∧ ¬ab aspect3 x ⊃ ¬flies x]
  ∧ ∀ab' flies'.[[∀x.¬ab' aspect1 x ⊃ ¬flies' x] ∧ [∀x.bird x ⊃ ab' aspect1 x]
∧ [∀x.bird x ∧ ¬ab' aspect2 x ⊃ flies' x] ∧ [∀x.ostrich x ⊃ ab' aspect2 x]
∧ [∀x.ostrich x ∧ ¬ab' aspect3 x ⊃ ¬flies' x]
  ∧ [∀x.ab' x ⊃ ab x]
   ⊃ [∀x.ab x ≡ ab' x]]%1.

	⊗A(ab,flies) is guaranteed to be true, because it is part of what
is assumed in our common sense data base.  Therfore ({eq a20}) reduces
to

!!a20a:	%2∀ab' flies'.[[∀x.¬ab' aspect1 x ⊃ ¬flies' x] ∧ [∀x.bird x ⊃ ab' aspect1 x]
∧ [∀x.bird x ∧ ¬ab' aspect2 x ⊃ flies' x] ∧ [∀x.ostrich x ⊃ ab' aspect2 x]
∧ [∀x.ostrich x ∧ ¬ab' aspect3 x ⊃ ¬flies' x]
  ∧ [∀x.ab' x ⊃ ab x]
   ⊃ [∀x.ab x ≡ ab' x]]%1.

	Our objective is now to make suitable substitutions for ⊗ab' and
⊗flies' so that all the terms except the last of ({eq a20a}) will be true,
and the last term will determine ⊗ab.  The axiom ⊗A(ab,flies) will then
determine ⊗flies, i.e. we will know what the fliers are.
⊗flies' is easy, because we need only apply wishful
thinking; we want the fliers to just those birds that aren't ostriches.
Therefore, we put

!!a21:	%2flies' x ≡ bird x ∧ ¬ostrich x%1.

	⊗ab' isn't really much more difficult, but there is a notational
problem.  We define

!!a22:	%2ab' z ≡ [∃x.bird x ∧ z = aspect1 x] ∨ [∃x.ostrich x ∧ z = aspect2 x]%1,

which covers the cases we want to be abnormal.

[This proof isn't finished in this draft].

.skip 1
#. The blocks world

	The folowing set of "situation calculus" axioms solves the
frame problem for a blocks world in which blocks can be moved and painted.

!!b1:	%2∀x e s.¬ab aspect1(x,e,s) ⊃ location(x,result(e,s)) = location(x,s)%1.

!!b2:	%2∀x e s.¬ab aspect2(x,e,s) ⊃ color(x,result(e,s)) = color(x,s)%1.

Objects change their locations and colors only for a reason.

!!b3:	%2∀x l s.ab aspect1(x,move(x,l),s)%1

and

	%2∀x l s.¬ab aspect3(x,l,s) ⊃ location(x,result(move(x,l),s)) = l%1.

!!b4:	%2∀x c s.ab aspect2(x,paint(x,c),s)%1

and

	%2∀x c s.¬ab aspect4(x,c,s) ⊃ color(x,result(paint(x,c),s)) = c%1.

Objects change their locations when moved and their colors when painted.

!!b5:	%2∀x l s.¬clear(top x,s) ∨ ¬clear(l,s) ∨ tooheavy x
 ⊃ ab aspect3(x,move(x,l),s)%1.

This prevents the
rule ({eq b3}) from being used to infer that an object will move
if it isn't clear or to a destination that isn't clear
or if the object is too heavy.

!!b6:	%2∀l s.clear(l,s) ≡ ¬∃x.(¬trivial x ∧ location(x,s) = l)%1.

A location is clear if all the objects there are trivial.

!!b7:	%2∀x.¬ab aspect7 x ⊃ ¬trivial x%1.

Trivial objects are abnormal in %2aspect7%1.